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ABSTRACT 

We present a model for merger-driven evolution of the mass function for massive galaxies and their 
central supermassive black holes at late times. We discuss the current observational evidence in favor 
of merger-driven massive galaxy evolution during this epoch, and demonstrate that the observed 
evolution of the mass function can be reproduced by evolving an initial mass function under the 
assumption of negligible star formation. We calculate the stochastic gravitational wave signal from 
the resulting black- hole binary mergers in the low redshift universe (z < 1) implied by this model, and 
find that this population has a signal-to- noise ratio as much as ~ 5x larger than previous estimates 
for pulsar timing arrays, with an expectation value for the characteristic strain h c (f = lyr -1 ) = 
5.8 x 10 -15 that is already in tension with observational constraints, and a 2a lower limit within 
this model of h c (f = lyr -1 ) = 2.0 x 10~ 15 . The strength of this signal may therefore be detectable 
with the data already collected using the current generation of pulsar timing arrays, and could be 
detected with high statistical significance under conservative assumptions within the next few years, 
if the principle assumption of merger-driven galaxy evolution since z = 1 holds true. For cases where 
a galaxy merger fails to lead to a black hole merger, we estimate the probability for a given number 
of satellite unmerged black holes to remain within a massive host galaxy, and interpret the result in 
light of ULX observations. In particular, we find that the brightest cluster galaxies should have 1-2 
such sources with luminosities above 10 39 erg/s, which is consistent with the statistics of observed 
ULXs. 

Subject headings: black hole physics — gravitational waves — relativity 



1. INTRODUCTION 

The growth of supermassive black holes (SMBH) at the 
centers of massive galaxies is thought to have occurred 
primarily at large redshifts, with the cessation of quasar 
activity at z < 2 signaling the end of SMBH growth as 
well (see iNataraianl (|2004f ) for a review). More specif- 
ically, SMBHs are thought to grow primarily through 
gas accretion at higher redshifts; gas-rich galaxy merg- 
ers would trigger the accretion episodes through which 
SMBHs would grow, with the actual merging of multi- 
ple SMBHs being a subdomin a nt mode of SMBH g rowth 
ijVolonteri fe Nataraianl l2009t iTreister et alj l201Clh . At 
lower redshifts, gas depletion would result in a greater 
relative importance of merging black holes to SMBH 
mass growth. This epoch is associated with lower rates of 
growth in SMBH mass density overall. However, merg- 
ers can redistribute the mass density at late times (z < 1), 
leading to significant growth of the most massive black 
holes, even in the absence of observable amounts of gas 
accretion. The increased importance of mergers at late 
times, if true, will have two inescapable consequences: 
the stochastic gravitational- wave background from merg- 
ing SMBHs will be larger than otherwise expected, and 

stmcwill@princeton.edu 



any black holes that fail to merge will spiral through their 
host galaxy indefinitely, accreting gas and stars and emit- 
ting X-rays as they go. 

Recent evidence indicates that, although these mostly 
"red and dead" massive galaxies undergo little star 
formation at late times, they continue to merge un- 
abated, so that mergers dominate the evolution of 
the black- hole mass function for large masses at z<l. 
Both observations (iBell et alj|2006t iRobaina et"ai] |2010t 



Ivan Dokkum et all 120101; iTruiillo et all 120111) and de 



tailed simulations l|Naab et al.H2009HOser et al.ll2010H in- 
dicate that massive galaxies generally double their stel- 
lar mass and consequently their black hole mass since 
z = 1 primarily through mergers with M*/M. > 1/10, 
where the black hole of mass from the satellite galaxy 
merges with the host black hole of mass through dy- 
namical friction, stellar hardening (or some alternative 
process), then gravitational wave emission, and the gas 
and stars from the satellite are deposited in the outer 
reaches of the host galaxy. 

The anomalously large optical luminosity observed in 
the brightest cluster galaxies (BCGs) has long suggested 
that mergers play a critical role in their evolution, pro- 
viding another indication of the importance of merg- 
ers for massive galaxies. The Bautz-Morgan classifica- 
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tion ([Bautz fe Morganlll970ft of galaxy clusters depends 
primarily on the magnitude difference, Ami 2, between 
the first and second brightest galaxies in a cluster; the 
large size of this di fference in s ome clusters had been a 
noted anomaly (seelAbell| (|1965D and references therein). 
lTremaine"fc Richstone (119771) invented a statistical test 
that implicated a physical process, rather than a statis- 
tical variation, as the cause of the large value of Awi2- 
lOstriker fe Hausmanl (|1977l) and lHausman fc Ostrikerl 
(|1978f) argued that a plausible origin was "galactic can- 
nibalism" driven by the process of dynamical friction, 
which would cause a massive satellite galaxy to spiral 
into the center of a cluster and merge with the BCG re- 
siding there. Many subsequent studies have affirmed the 
likelihood of this scenario. 

In this work, we will discuss the observable conse- 
quences of the fact that these merging massive galaxies 
will generally contain SMBHs. The satellite black holes 
will be detectable as accreting X-ray sources as they or- 
bit within the cannibal BCG, and then as gravitational 
wave sources if they reach the more massive black hole 
at the center of the BCG. We will demonstrate that the 
observed evolution of the galaxy and black-hole mass 
functions can be reproduced by a calculation that as- 
sumes only mergers drive the evolution. By comparing 
the calculation with the observed evolution, we can find 
the total number of mergers necessary to reproduce the 
observations. With this number in hand, we perform 
a Monte Carlo simulation of the population of merging 
black holes, from which we can directly extract the total 
expected gravitational wave signal. Our results, which 
are the first to be derived primarily from observational 
datgQ, indicate that the signal within the frequency band 
of pulsar timing arrays (PTAs) is substantially larger 
than previously estimated, and is in fact at the threshold 
of detectability for the current generation of PTAs. Fur- 
thermore, the signal extends to higher frequencies than 
previous estimates, potentially reaching the lowest sen- 
sitive frequencies for space-based gravitational wave ob- 
servatories. In addition to the galaxy mergers that yield 
gravitational wave signals, many galaxy mergers will re- 
sult in what we will call a "stalled" satellite, where the 
smaller black hole cannot make its way to the host core 
in less than a Hubble time. We calculate this popula- 
tion as well, and compare their quantity and expected 
characteristics to those observed in ultraluminous X-ray 
sources (ULXs). 

In Sec. [2j we will describe our model for the black- 
hole mass function, first presenting the explicit redshift- 
dependent expressions based primarily on observational 
data in Sec. I2.H then discussing our construction of a set 
of evolution equations for propagating an initial mass 
function to later redshifts without explicitly enforcing 
any redshift dependence in Sec. 12.21 We calculate the 
dynamics of the black holes after galaxy merger to de- 
termine whether the black holes will merge in Sec. [3J 
We then apply our model to first calculate the stochas- 
tic gravitational-wave signal from merging SMBH bina- 
ries in Sec. IH elaborating on the results presented in 



1 After completion of this manuscript, we became aware of an in- 
dependent ca lculation of the gravitational-wave signal from PTAs 
l|Sesanall2012l ). which also relies on the observed galaxy mass func- 
tion. 



iMcWilliams et al.l (|2012ft . We then use our model to 
predict the number of stalled satellite black holes from 
failed mergers and their potential appearance as ULXs 
in Sec. [5j We summarize and conclude in Sec. El 

2. THE MASS FUNCTION 

2.1. The Schechter Parameters 

Any attempt to calculate the stochastic background 
of gravitational waves from SMBHs requires a model for 
the the mass and redshift distribution of galaxy mergers. 
In order to anchor our model in reality, we wish to cal- 
ibrate it using current observations of the the mass and 
redshift distribution of galaxies, typically referred to as 
the galaxy stellar mass function. The information con- 
tained in observations is condensed by fitting the data to 
established models of the mass function, which are gen- 
erally empirical. The stellar bulge and black hole mass 
functions are frequen tly parameterize d using a Schechter 
function of the form (|Schechterlll976f) . 



(j> (M) dM = <p M a exp(-M) dM , 



(1) 



where 4>dM = (dn/ dM) dM is the comoving number 
density of either black holes or bulges with masses be- 
tween M and M + dM, with M = M./M. for black 
holes or M+/M.+ for bulges. /A, (p, and a are the three 
Schechter parameters, and the subscripts "•" and 'V re- 
fer to black holes and stellar bulges, respectively. How- 
ever, Eq. [T] fails to accurately reproduce the observed 
mass function when BCGs are included. In order to rep- 
resent both BCGs and less massive galaxies (subscripted 
"low"), we combine Eq . Q] wi th a G aussian component 
representing BCGs ([Lin et al.ll2010t l. 

<j> (M) dM = 0i ow + <j) BC G) dM = Lp M a exp(-M) dM 
1 /2.51ogM\ 2 
~2 ^ G M 



+ ipexp 



- 1 



dM , (2) 



where the single new parameter, <jm, determines the e- 
folding of the high-mass, BCG-populated portion of the 
mass function. We use ctm = 0.35 at all redshifts, since 
there are no apparent physical grounds for <tm to vary, 
and this value guarantees the inclusion of ultramassive 
black holes like M87 within our distribution. We em- 
phasize that recent observational results indicate that 
simple scaling relationships like M,-M+ may systemati- 
cally underestimate the mass of black holes in BCGs b y 
a factor as large as 10 (Hlavace k-Larrondo et al.ll2012f) . 
so we would be equally justified in enhancing the high- 
mass portion of the distribution in a similar fashion by 
including a mass dependence in the proportionality con- 
stant of the M m -M± relation. To avoid ambiguity, we will 
use (f>tot = cf> to represent the full mass function in any 
expressions that include both the full function and the 
individual components. 

Since observational data are generally available for 
stellar bulges rather than for massive black holes di- 
rectly, we relate the Schechter parameters for the cen- 
tral black holes to the parameters for their concomitant 
stellar bulges by assuming M = M, = 1. 6 x 10~ 3 M± 
([Haring fe Rbd l200l lYu fc Tremainel 12001 , 
a*, and ip = ip, = </?*? which are the only choices con- 
sistent with negligible star formation and negligible ac- 
cretion of gas onto the central black holes. There are a 
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number of other corollaries for merger-driven evolution: 
the black- hole mass-stellar-bulge mass (M.--M*) relation 
and the black- hole and stellar-bulge densities, p, and p*, 
are all redshift independent; A4(z — 0) > M(z > 0); and 
a(z = 0)>a(z > 0). 

We note that the observational data on the redshift de- 
pendence of the Schechter parameters have substantial 
variance, which provides the motivation for developing 
theoretically-motivated dependencies to inform choices 
for our model. For instance, although we assume a con- 
stant Mm-M.,, relationship, the literature varies wildly, us- 
ing M, rj 10~ 3 Mt(l -I- z) a with a as large as 2, although 
larger values are typically drawn from fits to data that 
span to higher redshifts. For A4, we use 



M 



1.2 x 10 8 

T+z 



• 



(3) 



Though consistent with the obser vational data indicat- 
ing mass-doubling since z = 1 (jRobaina et al.l 120101 : 
van Dokkum et al.l 120101). this choice of redshift de- 



pendence also makes sense in light of our assumption 
of merger-driven evolution, since massive galaxies with 
M* ~ (D(A4+) should become more massive at lower red- 
shifts as they consume less massive satellites. 
We choose a constant value for ip, 



<p = 3 x 10~ 3 Mpc~ 



(4) 



which is consistent with the literature (|Cole et al.ll200lD 
at low redshifts when the final Wilkinson Microwave 
Aniso tropy Probe (WMAP ) cosmological parameters are 
used (jKomatsu et al.l [201 1|) . and is also consistent with 
merger-driven evolution, except in the unlikely scenario 
that most mergers occur between nearly equal mass bi- 
naries. Finally, to find a, we first integrate the mass 
function for total comoving mass density, 



Mn.dM = MipT [a + 2] = (p.) = const. (5) 



We take (p.) to be 6 x 10 5 M W /Mpc 3 (|Yu fc Tremaind 
120021: iGraham fc Driverll2007l) . and solve for the redshift 
dependence of a that satisfies mass conservation. We 
expand the resulting a to O (1/(1 + z)), and find that 



-2.0 



0.52 



(6) 



is accurate to within a percent for < z < 1. We note that 
the redshift dependenc e of a agrees reasonab ly well with 
the literature (see e.g. iKaiisawa et al.l ([2009)); however, 
a constant tp is generally inconsistent with observational 
data in the literature, which typically measures tp to be 
a decreasing function of redshift. We suspect sampling 
effects may account for this to some degree, with galaxies 
evolving from blue to red, and effectively "appearing" in 
a survey at lower redshifts, thereby increasing the appar- 
ent number density of massive red galaxies. In contrast, 
our approach conserves, by construction, the mass den- 
sity of old stars. 

The principle theoretical reason for choosing a constant 
tp is that mergers involving massive galaxies are unlikely 
to be equal mass mergers, given the rarity of black holes 
with M, ~ A4 t . To intuit why this implies a constant ip, 
and what behaviors we might expect for M. and a, we 
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Fig. 1. — Evolution of the black- hole mass function (equivalently, 
the stellar mass function using M+ = 625 M m ) from z I. to z 0. 
The dotted (black) curve shows the initial z = 1 mass function, 
and the solid (gre en) and dashed (red) curves show t he an alytic 
(described in Sec. 12. U and evolved (described in Sec. 12.211 mass 
functions, respectively, at z = {2/3, 1/3, 0}. 

can use a very simple model of galaxy evolution solely 
for the purpose of gaining intuition about the evolution 
of the Schechter parameters. Instead of using linear-in- 
mass bins for the mass function, we can imagine that 
mass-doubling occurs only through equal mass mergers, 
whereas mass doubling of high-mass bins occurs through 
the depletion of the lowest-mass bins in the minor-merger 
scenario. For the major-merger scenario, after doubling, 
every log-mass bin will simply shift toward larger mass 
by one log-mass unit and will halve their number density, 
doubling M., halving ip, and leaving a constant. How- 
ever, for the minor-merger scenario, the number of very 
massive galaxies remains fixed while they accrete smaller 
satellites, so that after mass doubling, massive number- 
density bins will simply shift by one log-mass unit to- 
ward higher mass at fixed number density, whereas low- 
mass bins will shift toward higher mass with their num- 
ber density being depleted by mergers with more massive 
galaxies, and replenished by mergers between lower mass 
members. If we require that the distribution maintains a 
single power-law mass distribution for less massive galax- 
ies, then in the minor-merger scenario the low-mass bins 
must have their number density depleted relative to the 
high-mass bins, in which case the slope of the mass func- 
tion at low masses would evolve toward shallower, less 
negative a values as we evolve toward lower redshift. 

2.2. Evolution of the Mass Function 

In this section, we investigate how closely the observed 
evolution with redshift of the mass function can be repro- 
duced with a simple model that assumes merger-driven 
evolution over the redshift interval z< 1. The evolution 
of the mass function generally results from some com- 
bination of mergers, star formation, and mass loss (e. g. 
through winds): 











{(f> dM) dz 



d 



dz 



(cpdM) dz 



d 
"dM 



(7) 

where {jff) ~ is the mean rate of change of mass 
at fixed number densities. To find the contribution from 
mergers, (<pdM) dzj , we need to know the number 
of mergers between a black hole with .M-normalized mass 
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between M' and M' + dM' and another with normalized 
mass between M" and M" + dM" at a redshift between 
z and z + dz. This requires some care, since BCGs will 
not merge with each other, so they must be handled sep- 
arately. Since we are assuming merger-dominated evo- 
lution, the mass dependence of the merger probability 
distribution will be proportional to the number densities 
of the two merging black holes. The redshift dependence 

I 



low, BCG} 



dM'dW'dz 



dM'dM"dz = P{z)dz i 



P(z) can be chosen to maximize the agreement between 
between our model and the observed evolution of the 
mass function. We therefore express the number density 
of mergers between black holes of .M-normalized mass 
M' = M'./M T and M" = M'JM t at redshift z (where 
M v = (l + z)M is the redshifted characteristic black hole 
mass) as 



{tot, BCG 



} {M')dM' 4> {tot , Xavt }{M")dM" 



(8) 



where in the notation </>{ aj b}i the first arguments all form 
an equation for </> a , and the second arguments form an 
equation for D - In addition, P{z) can be separated into 
a purely cosmological component and a comoving merger 

probability component as P(z) = d ^ z ^ z , ^7- is the 

merger probability in a fixed comoving volume, n{z) is 
the total number density at a given redshift, and dV c /dz 



J 



I 

accounts for cosmological expansion and is given by 

dVc 
dz 



4ircD 2 L 



h {i + z f^n M {i + zf + n K 



(9) 



We assume a flat cosmology with WM AP seven-year 
value s for the cosmological parameters ()Komatsu et al.l 
1201 ID ; therefore, the luminosity distance is given by 



Dl(z) 



c(l + z) 



dz 1 



(10) 



/o ^/n M (i + z')* + n A 

(see [Hogg] (|1999f ) and references therein). We can then 
And the number of black holes that leave (the sink term 
"si" ) and enter (the source term "so" ) a mass bin at M. 
during the interval dz, 



' d 

^^{low.BCG} ( M ) 

d - ' 

7^0{low,BCG} (M) 



= P{z) / 0{tot, BCG} (M> {tot , low} (M') dM' 

i Jl0 6 M o /A1 r 

l-(M-W 6 M Q )/M r <.(A/-10 6 M e )/M, 

= P(z) / / S(M- M' - M")0 { iow. bcg} (MO dMV{iow, low} (M") dM" 

J10 e M (3 /M I Jl0 6 M e /M, 
HA/-10 6 M Q )/M r 

= P(Z) 0{low.BCG}(MO0 { low,low}(M-MOdM'. (11) 

JlO 6 M e /M, 



We will consider the mass range 10 6 M < M, < 10 11 M Q 
throughout this work, although the precise choices do 
not significantly affect any of the results that we present. 
All that remains is to determine whether 4rr can be wcll- 
approximated (and replaced in our evolution equation) 
by a constant value. We find that fixing = 264 in our 
evolution equation reproduces the redshift-dependence of 
the mass function using our redshift-dependent Schcchtcr 
parameters remarkably well (see Fig.[l}. 

It is noteworthy that, despite the good agreement 
between our model based on galaxy evolution through 
mergers and our adopted Schcchter-based mass function, 
the evolved solutions can only approximate Eq. [2j The 
analytic form of the source term involves generalized hy- 
pergeometric functions among others, so that the evolved 
mass function does not simplify to the functional form 
of Eq. [2] The shape of the transition from less massive 
black holes to the BCG component as well as the expo- 



I 

nential decay for very large masses in the evolved model 
depend sensitively on the location of the break mass A4, 
which sets both the "knee" location and the e-folding, 
so a sufficiently fine redshift sampling (or continuous in- 
tegration, as we have done) is necessary to accurately 
approximate the target mass function at each redshift. 

3. SATELLITE BLACK HOLE DYNAMICS WITHIN 
THE HOST GALAXY 

Having established a method for calculating the dis- 
tribution of galaxy mergers, we must now determine 
whether those mergers lead to the merger of the concomi- 
tant SMBHs and thereby contribute to the gravitational 
wave background. In the case of major mergers, the stel- 
lar bulges and their central black holes will usually merge 
in less than a Hubble time. We loosely define major 
mergers as having mass ratios q = M s /M h > 1/4, where 

M h is the larger "host" mass, and M s is the "satellite" 
mass (either black hole or bulge). For minor mergers, 
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the satellite black hole will likely be stripped of its stellar 
bulge, which is deposited outside the host bulge and ex- 
plains the exceedingly l arge increase in the appar ent size 
of galaxies since z = 1 (jvan Dokkum et aLll2010h in this 
model. However, the satellite black hole must be able to 
migrate to the central core of the host under dynamical 
friction by the present day to contribute to the gravita- 
tional wave signal. To determine whether this occurs, we 
use the wel l-known expression fo r the dynamical friction 
timescale (jChandrasekharl 119431 : iTremaine et al.l Il975fl 
in a c onvenient form (Eq. 8.12 in iBinnev fe Tremaind 

Call), 

t 19Gyr f R e V a{R e ) 10 8 M Q 

DF ln(l + \5kpc7 200km/s M s 

? (6.9-ln«) \uPM e J ( + ' ' 1 ' 

where R e is the half-light radius of the host galaxy and 
er is the local velocity dispersion, for which we use 

/ M \ 0,73 
fl e = 2.5kpc(j^H {l + z)- lM mi& 

f M \ °- 2 
^ e ) = 190Ws^j (l + zf A \ (13) 

where the mass-dependence c omes from fits to S loan Dig- 
ital Sky Survey (SDSS) data (<Nipoti et al.ll2009D . and the 
redshift dependence comes from fits to simulation results 
w hich were shown to be consistent with various surveys 
in lOser et ail (|2010D within the redshift range we con- 
sider. 

We note that Eq. (|T2"j) differs in multiple ways from 
what is often used in merger-tree models (see Eq. 9 in 
IVolonteri et al.l ()2003[ )). Specifically, it is often assumed 
that £df depends only on the mass ratio of the merging 
pair, i. e. dynamical friction does not depend on the mass 
scale. This would be a very surprising result for multiple 
reasons. The appropriate £df for merger-trees should in- 
corporate both the merging dark matter halos and, sub- 
sequently, the stellar bulges. The mass of the host and 
the inclusion or exclusion of its stellar bulge will affect its 
equilibrium density distribution, and will therefore affect 
the tidal stripping of the satellite. Even if the inclusion 
of the bulge does not affect t pF significantly as argued in 
IBovlan-Kolchin et al.l ((2008), the radius of injection for 
the satellite and the orbital kinematics of the satellite 
should depend directly on the host mass. Since we need 
an expression for tuF that applies for £df ~ in order to 
determine whether the host and satellite will merge, the 
scenario of a "naked" satellite black hole moving through 
the host bulge by dynamical fraction is precisely the case 
of interest. Therefore, in Eq. (Tl2")) . the relevant mass ratio 
is M£/M%, rather than M^/M^ as is oft en used in dark 
matte r merger trees, such as the model of lVolonteri et all 
(2003) which has been employed frequently for gravita- 
tional wave calculations. Finally, in the scenario of a 
stripped satellite, the satellite black hole is effectively 
deposited at R e within the host galaxy, and spirals to- 
ward the center through dynamical friction. R e is far 
smaller than the virial radius, which is the fiducial dis- 
tance scale used in dark-matter merger-tree simulations 



(|Volonteri et aLlfeOOSl) . where the merger of two halos is 
considered, and the bulges are ignored when calculating 

*DF- 

At each redshift, we use Eq. (fT2|) to determine which 
black holes will merge. In order to calculate the total 
gravitational wave signal from the merging population 
using the Monte Carlo method, we need to sample from 

,4 

the probability distribution dMld ^[ 2dz d f ■ Eq. [8] provides 
most of what we need, since ,, Af d ,,f f i = — rr -rn^ra ■ 

' dMxdM^dz n{z) dMxdM^dz 

The probability of finding a binary in a given frequency 
interval df is found by observing that the quadrupole 
frequency rate / oc J 11 / 3 , so that ^ oc / _11 ^ 3 , with the 
normalization being determined by the range of possible 
frequencies. 

It is known that dynamical friction becomes ineffective 
once the binary reaches a separation of 0(1 pc), so we 
must assu me a mechanism for solving t he "last parsec 
problem" (Me rritt fc Milosavrjevid l2005') in order to de- 
termine the minimum frequency where gravitational ra- 
diation wfil_drive the evolution of the binary. Following 
IQuinlanl (1996), we assume the binary hardens through 
the repeated scattering of stars in the core of the host, 
until gravitational radiation becomes the dominant pro- 
cess at 

f - 2 TxlO^Hz ( M ' M ' ( M >+ M > \° 2 

(14) 

The maximum frequency is set to the innermost sta- 
ble circular orbit (ISCO) for a test particle orbiting a 
Schwarzschild black hole with the same total mass, 

/ ma ^/ IS co = lTxlO-Hz(^-) ,(15) 

though we emphasize that the results are insensitive to 
the precise choice of / max , since we perform a random 
Monte Carlo draw from an /~ n / 3 distribution, so only 
/mm plays a critical role. Using a Monte Carlo sample 
to set the starting frequency, we invert 

/(*) = i.4xio-hz (WW"*) 3 '* (m±±mlV 

M ' V M. h A/. s t m -tj \2x 10 8 Mq J 

(16) 

to find the corresponding time before merger, then reap- 
ply Eq. (|16[) to find the frequency after a T = 5 year 
observation, where t m approximates the time of the final 
merger. We note that all of the frequency expressions 
given above are appropriate for the rest frame of the 
source, f r , and must be redshifted to give the measured 
gravitational wave frequency, / m , using /,„ = /,./ (1 + z). 
With the range of permitted frequencies now determined 
by the dynamics of each binary, we can generate grav- 
itational wave signals from each, and combine them to 
form our estimate of the stochastic gravitational wave 
background. 

4. GRAVITATIONAL WAVES FROM SMBH 
MERGERS SINCE Z = 1 

In order to calculate the total gravitational wave sig- 
nal from all SMBHBs in the PTA band, we must com- 
bine our knowledge of the merger probability distribu- 
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tion 



d 3 p 



... , , , , and the total number of mergers N with 

dMxdM^dz G 

the signal strength of each source, given in units of di- 
mensionless strain h. If the resulting span of frequencies 
A/ = fit") — fit — T) < T _1 for a given source, then we 
assign a gravitational wave strain amplitude appropriate 
for a monochromatic source, which is given wi t hin t he 
quadrupole approximation by Eq. 58 of iThornel (|1987t) , 



h = 3.2 x 10 



-17 



2 x 10 8 M 







M.M* 

(10 8 Mq 

1/3 



Ml 1 + M. s 



lGpc 
D L 



f 



10" 7 Hz 



2/3 



,(17) 



where this amplitude is already averaged over source po- 
larization. Sources with A/ > T _1 are potentially re- 
solvable, and are not part of the stochastic background, 
since they introduce a correlation in the signal between 
adjacent frequency bins. In addition, sufficiently nearby 
sources may be sufficiently bright to be resolved even if 
they are monochromatic, but these sources would also 
need to be removed before the data could be considered 
stochastic. Therefore, we exclude both of these types of 
sources from the current study. We note that these re- 
solvable sources, although much rarer than sources con- 
tributing to the stochastic signal, will nonetheless domi- 
nate the total energy density in gravitational waves, and 
are the only means by which we might gain knowledge 
about individual binary systems. Resolvable sources are 
therefore of great interest as well; however, since they re- 
quire a much larger and more computationally expensive 
Monte Carlo sampling due to their rarity, we leave them 
for future work. 

Finally, we calculate the full stochastic gravitational 
wave signal by summing the power of each source, 
thereby approximating 



hlif) = 



where 



dz 



N 



10 11 M 



dNU 



10 6 M 



dz 



M 2 



dMiNh' 



d 4 p 



10 6 M 



dM 1 dM 2 dzdf 
(18) 



10 11 Mq 



dM 



10 6 Mq 



d 



(19) 



is the total number of mergers during the full observa- 
tion, and the frequency minimum and resolution are set 
assuming a 5 year observation. Due to the extremely 
large number, O(10 n ), of mergers required to duplicate 
the observed evolution of the mass function, and the fact 
that low mass black holes cannot be excluded a priori 
since they dominate the overall number density, we have 
performed a Monte Carlo for Nmc = 5 x 10 6 binaries, 
and scaled the result appropriately. To scale, we first fit 
the Monte Carlo result to the form 



h Q exp 



J_ 

/df 



-2/3 



1 



f_ 

fo 



(20) 



where h a , /df, /o, /?, and 7 are tunable parameters. The 
/~ 2 / 3 ter m is t he usual spectru m for stochastic binaries 
(|Phinnevll200lt Uaffe fe Bac ken l2003l). The fin al term in 
Eq. (1201) is the same one used in iSesana et all (|2008l) and 
accounts for the discrete nature of the sources and its 
effect at high frequencies where the number of sources 



diminishes. The exponential term is introduced here for 
the first time, and is primarily motivated by the sup- 
pression of high mass sources due to dynamical friction 
(i. e. the M^-de pendence of £qf in Eq. (1121) ). which was 
also observed in lKocsis fe Sesanal ([201 ID . We note that 
the mass dependence of / m j n in Eq. (fT4"|) can also lead 
to biasing against massive merging systems, so that the 
mechanism for solving the last parsec problem can also 
influence this region of the spectrum. We will use the pa- 
rameterization of Eq.(2D]to first fit, then scale the results 
from our Monte Carlo simulation. 
After calculating the best- fit parameters from Eq. (|2U)) , 

we multiply the resulting f Q by (A/Amc) 3 ^ 11 (since 
N oc dp/df oc J" 11 / 3 as previously mentioned), and 
we multiply the overall amplitude of hat by \J N /Nmc 
(equivalently, we multiply h Q by (A/Amc) 7 ^ 22 , since 

hat oc h a fo^ 3 for / -C / ). We find the lowest frequency 
bin at which the Monte Carlo simulation only predicts 

a single source, we again multiply by (N/Nmc) 3 ^ 11 , and 
we terminate the scaled fit at that frequency, since the 
signal is clearly no longer stochastic. The parameters 
/df, fl, and 7 remain fixed in our rescaling, since there 
is no apparent physical grounds for these parameters to 
depend on N, and empirically we find that fixing these 
values from one Monte Carlo provides a good fit for other 
simulations with different values of N. 

Fig. [^demonstrates the effectiveness of the overall scal- 
ing procedure by comparing two different Monte Carlo 
simulations. One simulation has Nmc — 5 x 10 6 , and 
the other has JVmc = 10 5 . We show the raw strain data 
from both simulations, along with the best fits of each 
simulation using Eq. (|2"0"j). We then scale the simula- 
tion with A^mc = 10 5 according to the scaling rules we 
have described, in order to compare with the actual fit 
of the Nmc = 5 x 10 6 simulation. Despite the scatter in 
the data, particularly at large frequencies, the scaled fit 
'agrees extremely well with the larger Monte Carlo fit, in- 
dicating that the scaling of the mean gravitational- wave 
strain is very robust. 

In Fig. El a Monte Carlo with A M c = 5 x 10 6 scaled 
to the actual number of expected binary sources (N = 
4.75 x 10 11 using Eq. [T9|) is shown, where the scaled-fit 
values (i. e. our estimate for h c that we would expect 
to observe from massive merging binaries at z<l) are 
ho = 8.4 x 10~ 16 , f = 2.5 x 10~ 6 Hz, / DF = 6.3 x 10~ 9 
Hz, 7 = —1.3, and /3 = —4, with the final frequency 
cutoff occurring at 6.2 x 10~ 6 Hz. We also estimated 
our uncertainty in h c as follows. We first assumed a 
±10 % uncertainty in log(A4), (p } log(/0.), and a, consis- 
tent with experimen tal ranges (|Drorv & Alvarez 2008; 
iKajisawa et al.ll2009f L We then ran Monte Carlo simula- 
tions for all 8 variations, and calculated best-fit parame- 
ters for h c using Eq. (j20|) . However, since all analyses of 
PTA data to-date have assumed a simplified form for the 
spectrum of the stochasti c background, which we label 
h s and which is given by (|Jaffe fc Backer! [2003h 
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(21) 



we must convert our range of hat functions to a range 
of functions of the form given by Eq. (|2T|) . We therefore 
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Fig. 2. — Gravitational wave strain demonstrating the effective- 
ness of scaling our Monte Carlo results to cases with a larger num- 
ber of sources. The (green) triangular markers show the result of 
a Monte Carlo where TVmc = 10 5 , and the (blue) circular markers 
show another Monte Carlo wit h A f;jc = 5 X 10 6 . We fit the signal 
from the smaller sample to Eg. 1201 resulting in the (purple) dashed 
line, then scale the resulting amplitude, reference frequency, and 
final frequency to N = 5 X 10 6 as discussed in the text. The result 
is the (red) dotted curve, which provides a very accurate repre- 
sentation of the Monte Carlo result from the larger sample. To 
demon stra te this, we also fit the signal from the larger sample us- 
ing Eg. 1201 with the (black) solid line being the result, which agrees 
well with the scaled fit to the signal from the smaller Monte Carlo 
sample. 

approximate the expected and lower-limit strain spectra 
by requiring that h s < (/ifit) an d h s < min(/ifi t ), respec- 
tively, for /<yr _1 using all 9 Monte Carlo realizations, 
and we likewise require that h s > max(/ifi t ) for the upper- 
limit strain spectrum. In this way, the uncertainty re- 
gion shown in Fig. [3] encompasses the full range of spec- 
tra resulting from the 95% confidence interval of best- 
fit Schechter functions to observational data. Because 
hfn is the trend of the Monte Carlo results by construc- 
tion, the large fluctuations around the trend evident in 
Fig. [5] are not included in the uncertainty interval esti- 
mate shown in Fig. [3J Given the large number of total 
sources, and their concentration at /^yr" 1 , we do not 
expect the variance a of the spectra to significantly affect 
the constraint for the particular Monte Carlo realization 
produced by nature, since a(f) ~ h c (f)/N(f). 

In additi on to our es ti mate for h Cl we show the 
results of ISesana et al.l (|2008f ) . which agrees with 
ISesana fe VecchicT i 20101) . The large discrepancy between 
these results and our expected strain spectra (our esti- 
mate is 3.3x larger at / = 0.2 yr _1 and 5.1 x larger at 
yr _1 ) requires an explanation, so we will address this 
disagreement before we discuss the remainder of Fig. [3J 
ISesana et al.l (|2008T ) construct dM d d M d using the meth- 



— — ; dM i dM->d z " 

ods described in iVolonteri et al.l (120031): specifica l ly, the 
extended Press-Schechter formalism Bon d et all (fl99l 
is used to construct a dark-matter-halo merger tree, 
which is populated with black holes using the M, — a* 
or M, — M+ relationships, with the baryon dynamics 
treated as we described in our disc ussion of dynamica l 
friction and the last parsec problem. ISesana et al.l (120081) 
also e mploy other models than that in IVolonteri et al.l 
(2003), but these models differ in ways that are not 
relevant to our discussion (such as details in the way 
that feedback mechanisms are included), and the re- 



sults from all of the models considered in ISesana et al.l 
(2008) agree with each other far m o re tha n they agree 
with our results. Scsa na fc Vecchiol (|2010l ) use a differ- 
ent method altogether, yet arrive at a nearly identical 

result. They use dM f d 2i 2 d z ^ populating the catalog 
of merged galaxies constructed in iBertone et al.l (|2007l ) 
with black ho les based again on the M. — o+ or M. — M* 
relationships. Bert one et al.1 (120071) used the Millennium 
simulation (jSpringel et al.ll2005h to determine the merger 
rates of dark matter halos, and used semi-analytic tech- 
niques to approximate detailed baryonic physics such 
as star formation, accretion, stellar and supernova wind 
feedback, and the eff ects of metallicity. However, while 
IBertone et al.l (|2007l ) do not explain in detail their treat- 
ment of baryon interactions during galaxy mergers, they 
do state that the "outcome" of a galaxy merger depends 
only on the mass ratio in their model, and no mention is 
ever made of black holes stripped of their stellar bulge. 
Therefore, we conclude that the treatment of dynamical 
friction and the last parsec problem is qua l itative ly no 
different from that used in IVolonteri et afl (|2003h . and 
therefore suffers from the same potential shortcomings 
for cases where the satellite black hole is stripped of its 
bulge. 



In addition to issues with the treatment of satel- 
lite dynamics within the host galaxy, t wo other fac- 
tors t hat a re shared by the t r eatme nts in ISesana et al.l 
(2008) and S esana fe Vecchiol (|2010f) contribute to their 
disagreement with our result; the first issue relates to 
the initial conditions of the dark matter simulations 
that are used, and the second issue arises from the 
nontrivial relationship between halo merger rates and 
gal axy merger rates. The Millennium simulation used 
by ISesana fc Vecchiol ([2010) took the observed fluctua- 
tions intli£_r20wer^e£tnam from the first year of WMAP 
data iBennett et al.l (|2003l) and used the data to deter- 
mine the initial distribution of d ark matter particles at 
z = 127. T he model presented in IVolonteri et al.l (j2003l ) 
and used in lSesana et al.l ()2008l ) predates WMAP, so that 
the abundance of X-ray emitting c luster s in the local 
universe as measured in I Eke et ail (|1996l) was used to 
normalize a theoretical fit to the CMB power spectrum 
(|Bardeen et al.l [l986t iSugivamal 119951) . In both cases, 
the amount of power contained in high-^ multipole mo- 
ments/small angular scales was underestimated in com- 
parison to results from the co mpleted seven year WMAP 
survey (|Komatsu et al.|[201lD . Since the black holes that 
we are interested in observing affect the power spectrum 
on small angular scales, this could have a significant im- 
pact on the total number of black holes, and therefore 
the number of mergers, that these models would predict. 

Finally, we consider the relative merger rates of dark 
matter halos with and without cores of baryonic mat- 
ter (i. e. the stellar component of the galaxy). From 
Eq. (TT2|) . we see that the dynamical friction timescale 
idf oc M h /M s . If a satellite halo is spiraling through 
a large host halo, as would be the case in a galaxy 
cluster, then the time it will take for the satellite to 
spiral to the center of the cluster scales inversely with 
M s . The size and mass of the satellite halo will be lim- 
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Fig. 3. — Gravitational wave strain and strain sensitivities for a 5 year integration. The strain signals are orientation-averaged, and the 
sensitivities are sky-averaged. The dotted (black) line shows the scaled Monte Carlo results for the stochastic signal using our fiducial 
mass function parameters (i. e.our prediction for the true gravitational-wave foreground from merging massive black holes at 2<1), and 
the (cyan) shaded region so our estimate of the possible range of stochastic signals based on uncertainties in our model and assuming a 
conventional / -2 / 3 spectrum, with the median signal shown as a thick solid (blue) line, and the optimistic and c onservative limits shown 
with thin dash-dotted (blue) lines. The thin solid (green) line shows the signal predicted in Scsana ct al. (2008), which is ~ 5x weaker 
than our median estimate at / = 1 yr _1 . The PTA sensitivities are scaled to match the 95% confidence limit of current PTAs and a 
future instrument like the SKA with 10 X better timing accuracy, where the current and future sensitivities are shown as dashed (purple) 
lines with lesser and greater sensitivity, respectively, and the typical PTA reference frequency of / = 1 yr — 1 is shown as a vertical thin 
dashed (black) line. We note that the expected signal essentially intersects the current PTA limit at / = 1 yr _1 , indicating that much of 
the parameter space for the stellar mass function has been excluded in our model, and that a detection in the near future is a prediction 
of our model. Finally, the dash-dotted (red) lines show an optimistic estimate of LISA'S and SCO's low frequency sensitivity (less and 
more sensitive, respectively), which indicates that SMBHB mergers at z< 1 may be a potential source of detectable stochastic gravitational 
waves below / ~ 10~ 5 Hz for space-based gravitational wave detectors. 



ited by tidal stripping, which will be more severe in the 
absence of a stellar core. The remaining halo, though 
still larger than the stellar core, is far less dense, so the 
core may dominate the total satellite halo mass. There- 
fore, the inclusion of the stellar core could greatly de- 
crease tdf and thereby increase the merger rate. By 
using the observed evolution of galaxies to calibrate 

dMidMzdz ' our m0( iel automatically includes the correct 
baryonic physics, whereas including the correct physics 
semi-analytically in dark matter simulations is far from 
trivial. Therefore, the primary cause for the difference 
between our merger rate and the rate found in prior pub- 
lications is that we estimate the merger rate using the 
observed density of massive galaxies, whereas most prior 
estimates have been based on dark-matter merger trees. 

Returning to our description of Fig. [3J we also show 
the rms strain sensitivity h ms , averaged over sky loca- 
tion, for the current European/NANOGrav/Parkes PTA 
(using the European constraint and assuming 100 ns tim- 
ing accuracy for all pulsars), and the Square Kilometer 
Array (10 ns accuracy), assuming all arrays observe 20 
pulsars for 5 years (see iSesana et al.l (|2008l ) for details 
on calculating PTA sensitivities). We note that our ex- 
pected strain spectrum of h = 5.8 x 10~ 15 very nearly 
equals the 95% conf idence limits of h < 6 x 1 0~ 15 from 
the European PTA ()van Haasteren et al.ll20lil ) and h c < 



7x10" 15 from the NANOGrav PTA (jDemorest et al.l 
2012). Therefore, it is likely that a rigorous analysis of 
actual PTA data using a model that better represents 
the behavior of the Monte Carlo results will already ex- 
clude large regions of the parameter space of Schechter 
parameters, assuming the overall validity of our merger- 
dominated assumption. Furthermore, our most conserva- 
tive estimated level for the stochastic background would 
provide a detection with the SKA monitoring 20 pul- 
sars for 5 years at a far greater s tatistical sign i ficanc e 
than the expected strain level from ISesana et all ([2008) ; 
ISesana fc Vecchiol (l2010h . If our model is valid, the SKA 
should not be necessary for a detection, as continued ob- 
servation of the current set of pulsars with the current 
level of timing accuracy has a high likelihood of yielding 
a detection long before the SKA begins collecting data. 

We also show the rms sensitivity for both the original 
Laser Interferometer Space Antenna (LISA) design, and 
a revised lower cost design (SGO Mid) currentl y under 
consid eration by NASA, both calculated using iLarsonl 
(120031) w i th the specifications for each design found in 
Stcbbinsl (|2011l ) (a design with near-identical sensitiv- 
ity called NGO is also being considered by the European 
Space Agency ) . We show se nsitivity estimates calculated 
directly from ILarsonl ([2003), although we note that the 
sensitivity of these designs bel ow ~ 3x 10~ 5 Hz is highly 
uncertain, and the results from lLarsonl (|2003f) are likely a 
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best-case scenario. In addition to the stronger predicted 
signal within the PTA band, we predict the potential 
for significant signal within the LISA/SG O band a s well , 
if the lowest frequency sensitivities from iLarsonl ()2003[ ) 
are realized. This would provide a new class of signal 
for space-based gravitational wave observatories, partic- 
ularly concepts with longer detector ar ms and l o wer fr e- 
quency sensitive bandwidths (see e.g. iFolknerl (|201lD L 
beyond the expected observation of individual sources 
sweeping through the band. 

We note that the gravitational wave signal we have 
estimated does not truncate abruptly as might be as- 
sumed from Fig. [3] Rather, the individual sources with 
A/ > T _1 that we mentioned earlier will sweep through 
the band toward higher frequencies. They will no longer 
constitute a stochastic signal, but will have a significant 
impact on the event rate of resolvable sources. Given 
the small fraction of total mergers represented in our 
Monte Carlo, we do not adequately represent these more 
rare "chirping" signals. A much larger sample, or limit- 
ing the parameter range from which the sample is taken, 
would allow us to better represent the full merger popula- 
tion and predict the event rate for individually resolvable 
merger events, which we plan for future work. In addi- 
tion to the expectation of an increased merger rate for 
SMBHs, the increased galaxy merger rate implies that a 
large number of satellite black holes that fail to merge 



with their host, and may be observable as ULXs. This 
provides a second definitive prediction for our model, and 
will be explored in the next section. 

5. STALLED BLACK-HOLE MERGERS AND ULXS 

We have noted that, for very massive galaxies such as 
BCGs, all but the most comparable mass galaxy merg- 
ers stall prior to black hole merger. This has little effect 
on the total gravitational wave signal, since it is domi- 
nated by major and near-major mergers. However, the 
possibility of stalled mergers, where a smaller satellite 
black hole resides in the outer regions of a massive host 
galaxy, is an interesting candidate as a model for ULXs. 
Given the rarity of very massive galaxies, the probabil- 
ity of a merger with a comparable-mass galaxy is small. 
Therefore, very massive galaxies will often retain some of 
the black holes from merged satellite galaxies, since they 
will be unable to reach parsec-scale separations from the 
host black hole through dynamical friction in less than 
a Hubble time. For an extremely massive system such 
as M87 with = 6.6 x 10 9 M Q , even very massive 
satellites with < 5 x 10 9 M Q would stall according 
to Eq. [T2J so essentially any merger would result in a 
stalled satellite. We can calculate the expected num- 
ber of satellites for a given host mass, (N sat (M^)} = 
N sat 

. total 

{M?)/N hoat (M*), where 



iVsat, total (M. h ) = / dzP(z) 



M-W b M Q 



10 6 M 



cj> (AT) low (M. h - M') 9 [t H - t DF (Af Af* - M', z)] dM' , (22) 



Q(t) is 1 for t > and otherwise, and 



N host (Mi l 



1 , dVc/dz ,, 
dz — -i— 4>{M,) 



n(z) 



(23) 



Since JV sa t will be Poisson-distributed, calculating 
(N sa t(Mi)) allows us to calculated the full distribution 
of stalled satellites for a given host. We show the result 
in Fig. 21 including the expectation value and the middle 
three quintiles of the distribution. 

It is interesting to note that the mean number of satel- 
lites for a massive host galaxy is ~ 1-2, consistent with 
the observed population of ULXs. That these satellites 
would be found in the outer regions of the host galaxy, 
and would likely be accompanied by enhanced star for- 
mation due to the interaction of the host and satellite 
bulges, is also consistent with the observed character- 
istics of ULXs. It has been noted that multiple ULXs 
are preferentially found in actively merging galaxies, so 
identified through mo rphology and kinematics (see e. g. 
iKaaret fc~ Fcng (2009|) and references therein). Because 
ULXs exceed the Eddington luminosity, Le, for a 20 M Q 
black hole, which is the largest mass expected through 
typic al stellar evolutionary channels (|Fryer fc Kalogeral 
120011 ) , intermediate-mas s black holes have been proposed 
as ULX central engines ([Colbert fc Ptakll2002t ). 

In the noteworthy case of M82, which is likely to have 
undergone a recent merger, it has been suggested that 
the ULX M82 X-l may be a low- luminosity active galac- 
tic nuclei powered by a intermediate-mass black hole 
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Fig. 4. — The expected number of satellite black holes, N sa t, for 
a given host black-hole mass is shown with a solid (blue) line. 
The central three quintiles of the probability distribution for -/V aa t 
are shaded and bound by dotted (blue) lines. The dashed (black) 
line shows the mass of M87 for reference. 



(IMBH) fr om the cor e of a dwarf galaxy that M82 can- 
nibalized (King 120041 ) . Alternatively, it has been sug- 
gested that ULXs are simply low mass X-ray binaries 
that are either accreting beyond the Eddington limit, 
or whose emission is being beamed. However, SMBH 
satellites, spiraling toward the host core and accreting 
at the Bondi-Hoyle-Lyttleton rate, would naturally ex- 
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plain the luminosity of ULXs without resorting to super- 
Eddington accretion, or the as-yet-uncertain existence of 
IMBHs. This possibility has not been widely considered, 
given the common assumption that sufficiently massive 

I 



satellites would sink to the host core in less than a Hub- 
ble time. However, as we have shown, this is far from 
the case in general, and especially so for BCGs. The 
luminosity of a satellite black hole is given by 



L B = r]M B c 2 = 47T77 (cGM.f PooV - 3 = 3 x 10 39 erg/s ( ^ 



Mi 



2.2 x 10 5 M 



100 cm- 



190km/s 



(24) 



r 



where r\ is the efficiency of converting accreted gas into 
luminosity, M B is the Bondi-Hoyle-Lyttleton accretion 
rate, and n are the background gas and particle num- 
ber density, respectively, and v 



1 + c 2 s is the 

total relative velocity of the satellite and the background 
gas, with v, the velocity of the satellite with respect to 
the mean gas flow, a the local velocity dispersion, and c s 
the local sound speed. We have normalized the mass so 
that Lb matches the bolometric Eddington luminosity 
of a 20 M Q black hole. 

In Fig. [SJ we show the tail of the luminosity function 
for M87, the most massive nearby example of a BCG. 
We follow the convention of assuming that an X-ray lu- 
minosity of 10 39 erg/s for a 20 M Q black hole implies a 
total bolometric luminosity beyond the Eddington limit. 
We note that, under this assumption, M87 contains ~ 2 
ULXs, whose luminosity could be explained by the pres- 
ence of a SMBH satellite, although their luminosities 
are less extreme than M82 X-l, and could also be ex- 
plained by an unusually massive stellar-mass black hole 
or by marginally super-Eddington accretion. In the case 
of M82 X-l, the observed X-ray luminosity of ~ 10 41 
erg/s could be explained by the presence of a Sagittar- 
ius A*-sized satellite black hole spiraling through M82 
under dynamical friction. Any ULX could also poten- 
tially be explained by the aforementioned models of a 
super-Eddington or beamed emission from a stellar-mass 
black hole, or an IMBH accreting near the Eddington 
limit. There is as-yet no unambiguous way to distinguish 
these scenarios from our supermassive satellite scenario; 
for example, since the emission mechanism is unclear, 
we cannot link variability to mass as conclusively as is 
done with active galactic nuclei. Furthermore, although 
quasi-periodic oscillations are observed in some ULXs, 
they are not a clear indicator of a binary companion, 
and could arise gen e rically in black hole accretion (see 
e.g. iDolence et all potl ). Our scenario has the ad- 
vantage of corroborating observational evidence in the 
form of merging galaxies and enhanced star formation 
in galaxies hosting ULXs, as well as requiring only con- 
ventional astrophysics in the form of dynamical friction 
and Bondi-Hoyle-Lyttleton accretion to explain the ob- 
servations. The abundance of ULXs may therefore be an 
additional indicator of the merger-driven nature of the 
evolution of the mass function for z < 1. 

6. CONCLUSIONS 

We have demonstrated that the mass function for mas- 
sive galaxies since z = 1 is consistent with merger- 
driven evolution. Massive galaxies, though rarer than 
their smaller counterparts, are critical when consider- 
ing overall luminosity, star formation rates, gravitational 
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Fig . 5. — The luminosity distribution for M87 \ J ordan et al.l 
2004) is shown with a solid (blue) line. The Eddington luminosity, 
Le, of a 20 Mq black hole, and the Bondi-Hoyle-Lyttleton lumi- 
nosity, Lb, of a 10 7 Mq black hole are given by the dashed (black) 
lines. The number of ultraluminous X-ray sources is consistent 
with the number of SMBH satellites expected in M87, as shown in 
Fig.H 



wave emission, etc. since the majority of total black 
hole mass is contained in individual black holes with 
masses above ~ 5 x 10 8 M© at z = (according to 
the mass function used in this work). Furthermore, de- 
spite the domination of the mass function by the number 
of smaller galaxies, our results indicate that the mass- 
weighted mean mass ratio between a host ga laxy and its 
satelli te is ~ 1/4, whi c h is c onsistent with iNaab et al.l 
(|2009() and lOser et al.l (|2010D . The novel expectation 
of at least one, and potentially several mergers with a 
comparable-mass companion for these m assive galaxies 
(jTrujillo et al.l 120111 : iHopkins et al.l 120101 ) has dramatic 
implications for gravitational wave observations. We 
have shown that if this new paradigm of merger-driven 
evolution holds true, then the gravitational wave signal 
from the stochastic population of merging SMBH bina- 
ries is much stronger than previously expected, exceeding 
previous signal-to-noise estimates by a factor of ~ 2-5, 
depending on the choose of parameters in the initial mass 
function. Indeed, our mean estimate of h a > 5.8 x 10~ 15 
nearly equals the 95% confidence limit the European 
PTA (6 x 10" 15 ) and the NANOGrav PTA (7 x 10~ 15 ), 
suggesting that PTAs may detect gravitational waves in 
the very near future.. 

In addition to an expected increase in black hole merg- 
ers, our assumption of merger-driven galaxy evolution 
also implies an increase in failed black hole mergers. For 
sufficiently massive host galaxies and/or sufficiently dis- 
parate mass ratios between the host and satellite, dy- 
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namical friction will fail to deliver the satellite to the 
core of the host. These stalled satellites would be ob- 
served as luminous X-ray sources. We find that, for the 
brightest cluster galaxy-component of the mass function, 
1-2 of these stalled satellites would be expected, with a 
large majority of hosts having 0-3 satellites. We have 
shown that this result is consistent with existing obser- 
vations of ultraluminous X-ray sources, such as those 
found in M87 and, notably, M82 X-l, although the na- 
ture of these sources is not yet known conclusively and 



will require further study. Nonetheless, a validation of 
our model through the observation of gravitational waves 
with pulsar timing arrays in the immediate future would 
also strongly suggest that at least some of these merging 
satellites would stall, and would therefore be observable 
as ultraluminous X-ray sources. 



We thank Cole Miller, Priya Natarajan, and Alberto 
Sesana for helpful feedback on this manuscript. 
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